medRxiv; 2024.
Preprint in English | medRxiv | ID: ppzbmed-10.1101.2024.01.03.24300797

ABSTRACT

Introduction: COVID-19 can rapidly lead to severe respiratory problems and can place an overwhelming burden on healthcare systems worldwide, making it imperative to identify high-risk patients and to predict survival and the need for intensive care (ICU). Most of the proposed models are not well reported, which makes them less reproducible and prone to a high risk of bias.

Methods: In this study, the performances of seven classical machine learning models (Random Forest (RF), Logistic Regression (LR), Support Vector Machine (SVM), k-Nearest Neighbor (KNN), XGBoost, Linear Discriminant Analysis (LDA) and Gaussian Naive Bayes (NB)) and two deep learning models (Deep Neural Network (DNN) and Long Short-Term Memory (LSTM)), in combination with two widely used feature selection methods (random forest and extra tree classifier), were investigated to predict "last status" (representing mortality), "ICU requirement", and "ventilation days". Fivefold cross-validation was used for training and validation. In each fold, 80% of the data were used to train the models and the remaining 20% were held out for validation. To minimize bias, the training and testing sets were split so that they maintained similar distributions. Before splitting, the k-nearest neighbour (KNN) imputation algorithm was employed to fill in missing data. In addition, bootstrapping was used for both oversampling and undersampling to address data imbalance. A publicly available set of 122 demographic and clinical features from 1384 patients was used. The performances of the models were evaluated using accuracy, sensitivity, specificity, and the area under the receiver operating characteristic (ROC) curve (AUC).

Results: Only 10 features out of 122 were found to be useful in prediction modelling, with "Acute kidney injury during hospitalization" being the most important one. Blood pH shows reasonable discriminative ability, especially in predicting "ICU requirement" and "ventilation days", whereas gender and age are found to be vital in predicting "last status". Selecting more than 10 features was observed to lower the prediction accuracy. The performances of the different algorithms depend on the number of features and on the data pre-processing technique. LSTM with balanced data and 10 features performs best in predicting "last status" as well as "ICU requirement", with an average of 90%, 92%, 86% and 95% accuracy, sensitivity, specificity, and AUC, respectively. DNN performs best in predicting "ventilation days", with 88% accuracy. For "ICU requirement", which is a binary prediction task, the data pre-processing technique has no influence on the prediction, and the performances of the different methods are comparable (89%, 98%, 78% and 95% accuracy, sensitivity, specificity, and AUC, respectively). However, the number of features selected varies with the data pre-processing technique.

Conclusion: Considering all the factors and limitations, including the absence of the exact time of clinical onset, LSTM with carefully selected features can predict "last status" and "ICU requirement" with approximately 90% accuracy, sensitivity, and specificity. DNN performs best in predicting "ventilation days". An appropriate machine learning algorithm with carefully selected features and balanced data can accurately predict mortality, ICU requirement and ventilation support. Such models can be very useful in emergencies and pandemics, where prompt and precise decision-making is crucial.
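
The evaluation protocol described in the Methods can be illustrated with a minimal sketch. This is not the authors' code: the abstract does not state which software was used, so the function names below (`stratified_folds`, `bootstrap_oversample`, `binary_metrics`) are hypothetical, only the oversampling half of the bootstrapping step is shown, and applying it to the training folds alone is an assumption. The sketch uses only the Python standard library and covers a fivefold split that keeps class proportions similar, bootstrap balancing, and the four reported metrics for a binary target.

```python
import random
from collections import defaultdict


def stratified_folds(labels, k=5, seed=0):
    """Assign sample indices to k folds so each fold has similar class ratios."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, y in enumerate(labels):
        by_class[y].append(idx)
    folds = [[] for _ in range(k)]
    for members in by_class.values():
        rng.shuffle(members)
        # Deal each class round-robin across the folds.
        for pos, idx in enumerate(members):
            folds[pos % k].append(idx)
    return folds


def bootstrap_oversample(indices, labels, seed=0):
    """Resample smaller classes with replacement up to the largest class size."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx in indices:
        by_class[labels[idx]].append(idx)
    target = max(len(members) for members in by_class.values())
    balanced = []
    for members in by_class.values():
        balanced.extend(members)
        balanced.extend(rng.choice(members) for _ in range(target - len(members)))
    return balanced


def binary_metrics(y_true, y_score, threshold=0.5):
    """Accuracy, sensitivity, specificity and AUC for a binary task.

    Assumes both classes are present in y_true (no zero-division guard).
    """
    tp = fp = tn = fn = 0
    for y, s in zip(y_true, y_score):
        pred = 1 if s >= threshold else 0
        if y == 1 and pred == 1:
            tp += 1
        elif y == 1:
            fn += 1
        elif pred == 1:
            fp += 1
        else:
            tn += 1
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    # AUC as the share of (positive, negative) pairs ranked correctly;
    # tied scores count as half a correct pair.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "auc": wins / (len(pos) * len(neg)),
    }


if __name__ == "__main__":
    # Synthetic labels only: 20 positives and 80 negatives.
    labels = [1] * 20 + [0] * 80
    folds = stratified_folds(labels, k=5, seed=1)
    train = [i for fold in folds[1:] for i in fold]  # 80% for training
    balanced_train = bootstrap_oversample(train, labels, seed=1)
    print(len(folds[0]), len(train), len(balanced_train))
```

In this sketch the held-out fold is left untouched so that the reported metrics reflect the original class distribution; whether the study balanced the validation data as well is not stated in the abstract.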


Subject(s)
COVID-19, Kidney Diseases, Memory Disorders, Respiratory Insufficiency